Target grade: A
I decided to continue with the dataset I found for my Coding 3 final project, that is the UFO sightings data from Tidy Tuesday. If you are unfamiliar with this dataset, please find below a short description (copied from my Coding 3 submission).
The UFO sightings dataset has been featured on Tidy Tuesday on the 25th week of 2023. The dataset comes from the National UFO Reporting Centre (NUFORC), and has been enriched with data from https://sunrise-sunset.org/.
The dataset contains three tables:
the ufo_sightings table contains data (like shape,
duration, date and time, location) on every UFO sighting reported by
NUFORC;
the placestable contains data on the sighting
locations (like timezone, elevation, population, coordinates);
the day_parts_map table contains the beginning of
certain day parts (like sunrise, twilight) for every coordinates and
date - this table has been used by the Tidy Tuesday team to enrich the
ufo_sightings table with the day_part
variable, denoting which time of the day the sighting took place (thus
this table is not used in my analysis).
The dataset contains 96,429 sightings (most of which are from the US) and 14,417 places.
As most of the observations are from the US, I decided to only concentrate on these.
library(tidytuesdayR)
raw_data <- tidytuesdayR::tt_load('2023-06-20')
## ---- Compiling #TidyTuesday Information for 2023-06-20 ----
## --- There are 3 files available ---
##
##
## ── Downloading files ───────────────────────────────────────────────────────────
##
## 1 of 3: "ufo_sightings.csv"
## 2 of 3: "places.csv"
## 3 of 3: "day_parts_map.csv"
library(data.table)
ufo_sightings <- data.table(raw_data$`ufo_sightings`)
places <- data.table(raw_data$`places`)
rm(raw_data)
ufo_sightings <- ufo_sightings[country_code == 'US']
places <- places[country_code == 'US']
Before doing any plotting, I defined my custom theme used on all of my plots.
theme_custom <- function() {
theme(
text = element_text(family = 'serif'),
axis.text = element_text(color = 'darkgrey', size = 8),
axis.title = element_text(color = 'darkgoldenrod4', size = 10, face = 'bold'),
panel.background = element_rect(fill = 'beige'),
panel.border = element_blank(),
plot.title = element_text(color = 'darkgoldenrod4', size = 13, face = 'bold', hjust = 0.5),
plot.subtitle = element_text(color = 'darkgoldenrod4', size = 11, hjust = 0.5),
legend.position = 'top',
legend.title = element_text(color = 'darkgoldenrod4', size = 10),
legend.text = element_text(color = 'darkgoldenrod4', size = 8),
panel.grid.minor = element_blank(),
panel.grid.major = element_line(color = 'bisque2', linewidth = 0.25),
axis.ticks = element_blank()
)
}
To visualize the evolution of the number of sightings over time, I added the year of the sightings as a transition state, and I created an animated racing barchart. For better visibility, I only show the top 25 states in each year by the number of sightings.
I added the group aesthetic and the
ease_aes so that the transition between states truly
resembles a racing barchart.
p1_data <- ufo_sightings[year(reported_date_time) >= 2000, .N, by = .(year(reported_date_time), state)][order(year, -N)]
p1_data <- p1_data[, rank := order(-N), by = year][rank <= 25, ]
library(ggplot2)
library(gganimate)
p1 <- ggplot(p1_data, aes(N, reorder(factor(rank), -rank), fill = state, group = state)) +
geom_col(alpha = 0.8) +
geom_text(aes(label = state), hjust = 1.2, vjust = 0.5, size = 5, color = 'white') +
labs(title = 'US UFO sightings in {closest_state} per state (top 25)',
subtitle = 'Total sigthings in this year: {ufo_sightings[year(reported_date_time) == closest_state, .N]}',
x = 'Number of sightings',
y = 'Rank') +
theme_custom() +
theme(legend.position = 'none',
panel.grid.major.y = element_blank()) +
scale_fill_discrete() +
transition_states(year, transition_length = 3) +
ease_aes('linear')
animate(p1, duration = 24, fps = 30)